56 research outputs found

    Unsupervised segmentation of bibliographic elements with latent permutations

    This paper introduces a novel approach for large-scale unsupervised segmentation of bibliographic elements. Our problem is to segment a word token sequence representing a citation into subsequences, each corresponding to a different bibliographic element, e.g., authors, paper title, journal name, or publication year. Obviously, each bibliographic element should be represented by contiguous word tokens; we call this the contiguity constraint. Therefore, we should infer a sequence of assignments of word tokens to bibliographic elements such that this constraint is satisfied. Many HMM-based methods solve this problem by prescribing fixed transition patterns among bibliographic elements. In this paper, we use the generalized Mallows model (GMM) in a Bayesian multi-topic model, effectively applied to document structure learning by Chen et al. [4], and infer a permutation of latent topics, each of which can be interpreted as one of the bibliographic elements. According to the inferred permutation, we arrange the order of the draws from a multinomial distribution defined over topics. In this manner, we can obtain an ordered sequence of topic assignments satisfying the contiguity constraint. We do not need to prescribe any transition patterns among bibliographic elements; we only need to specify the number of bibliographic elements. However, the method proposed by Chen et al. works for our problem only after modification. The main contribution of this paper is to propose strategies that make their method work for our problem as well.
    Workshops on Web Information Systems Engineering, WISE 2010: 1st International Symposium on Web Intelligent Systems and Services, WISS 2010, 2nd International Workshop on Mobile Business Collaboration, MBC 2010, and 1st Int. Workshop on CISE 2010; Hong Kong; 12 December 2010 through 14 December 2010; Code 8680
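    The core idea can be sketched in a few lines. This is an illustrative toy, not the paper's inference code: once per-topic token counts have been drawn from the multinomial over topics and a topic permutation has been inferred, emitting the draws in permutation order yields an assignment that satisfies the contiguity constraint by construction.

```python
def contiguous_assignment(topic_counts, permutation):
    """topic_counts[k] = number of tokens assigned to topic k;
    permutation = order in which the topics appear in the citation.
    Emitting counts in permutation order makes each topic's tokens
    contiguous by construction."""
    assignment = []
    for topic in permutation:
        assignment.extend([topic] * topic_counts[topic])
    return assignment

# Hypothetical labels: 0 = authors, 1 = title, 2 = venue
print(contiguous_assignment({0: 2, 1: 4, 2: 3}, [0, 1, 2]))
# -> [0, 0, 1, 1, 1, 1, 2, 2, 2]
```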

    Pattern Compression of FAST Corner Detection for Efficient Hardware Implementation

    This paper presents a stream-oriented FPGA implementation of the machine-learned Features from Accelerated Segment Test (FAST) corner detection, which is used in parallel tracking and mapping (PTAM) for augmented reality (AR). One of the difficulties of a compact hardware implementation of FAST corner detection is the matching process against a large number of corner patterns. We propose corner pattern compression methods focusing on discriminant division and on pattern symmetry under rotation and inversion. This pattern compression enables implementation of the corner pattern matching with a combinational circuit. Our prototype implementation achieves real-time execution performance using 7∼9% of the available slices of a Virtex-5 FPGA.
    2011 International Conference on Field Programmable Logic and Applications (FPL): Chania, Greece, 2011.09.5-2011.09.
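    The symmetry part of the compression can be illustrated in software (the paper's discriminant-division method and hardware mapping are not reproduced here): a FAST test examines 16 pixels on a circle, so each corner pattern is a 16-bit word, and patterns equivalent under rotation or inversion (mirroring) can share a single table entry.

```python
def rotations(p, n=16):
    """All n cyclic rotations of an n-bit pattern."""
    mask = (1 << n) - 1
    return [((p << i) | (p >> (n - i))) & mask for i in range(n)]

def mirror(p, n=16):
    """Bit-reversed (mirrored) pattern."""
    return int(format(p, f'0{n}b')[::-1], 2)

def canonical(p):
    """Smallest pattern in the rotation/mirror equivalence class."""
    return min(rotations(p) + rotations(mirror(p)))

# Two patterns that are rotations of each other collapse to one entry
patterns = {canonical(p) for p in [0b0000000111111111,
                                   0b1111111110000000]}
print(len(patterns))  # -> 1
```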

    Steering time-dependent estimation of posteriors with hyperparameter indexing in Bayesian topic models

    This paper provides a new approach to topical trend analysis. Our aim is to improve the generalization power of latent Dirichlet allocation (LDA) by using document timestamps. Many previous works model topical trends by making latent topic distributions time-dependent. We propose a straightforward approach: preparing a different word multinomial distribution for each time point. Since this approach increases the number of parameters, overfitting becomes a critical issue. Our contribution to this issue is two-fold. First, we propose an effective way of defining Dirichlet priors over the word multinomials. Second, we propose a special scheduling of variational Bayesian (VB) inference. Comprehensive experiments with six datasets show that our approach can improve LDA, and also Topics over Time, a well-known variant of LDA, in terms of test-data perplexity in the framework of VB inference.
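    The overfitting issue can be made concrete with a minimal sketch. The paper's actual prior construction and VB schedule are not reproduced here; this only illustrates the general device of smoothing a sparse per-time-point word distribution toward a corpus-wide one via its Dirichlet prior.

```python
def per_time_word_dist(counts_t, counts_global, eta=10.0):
    """Posterior-mean estimate of p(word | time point) under a Dirichlet
    prior whose base measure is the corpus-wide word distribution.
    counts_t: word counts at one time point (sparse, overfits alone);
    counts_global: corpus-wide word counts; eta: prior strength."""
    total_g = sum(counts_global)
    total_t = sum(counts_t)
    return [(c_t + eta * c_g / total_g) / (total_t + eta)
            for c_t, c_g in zip(counts_t, counts_global)]

probs = per_time_word_dist([3, 0, 1], [50, 30, 20])
print(probs)  # unseen word 1 still gets nonzero probability
```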

    Bag of Timestamps: A Simple and Efficient Bayesian Chronological Mining

    In this paper, we propose a new probabilistic model, Bag of Timestamps (BoT), for chronological text mining. BoT is an extension of latent Dirichlet allocation (LDA) and has two remarkable features when compared with the previously proposed Topics over Time (ToT), which is also an extension of LDA. First, we can avoid overfitting to temporal data, because temporal data are modeled in a Bayesian manner similar to word frequencies. Second, BoT has a conditional probability in which no functions requiring time-consuming computations appear. Experiments using newswire documents show that BoT achieves a more moderate fitting to temporal data in shorter execution time than ToT.
    Advances in Data and Web Management. Joint International Conferences, APWeb/WAIM 2009, Suzhou, China, April 2-4, 2009, Proceeding
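    The cheapness of the conditional can be sketched as follows (the notation is assumed, not taken from the paper): because timestamp tokens are modeled like word tokens, the collapsed Gibbs conditional is a product of count ratios, with no Beta densities or other special functions as in ToT.

```python
def bot_conditional(k, d, t, n_dk, n_kt, n_k, alpha, gamma, T):
    """Unnormalized p(z = k | rest) for a timestamp token t in document d.
    n_dk: per-document topic counts; n_kt: per-topic timestamp counts;
    n_k: per-topic totals; T: number of distinct timestamps.
    Only additions, one multiply, and one divide -- no special functions."""
    return (n_dk[d][k] + alpha) * (n_kt[k][t] + gamma) / (n_k[k] + T * gamma)

val = bot_conditional(0, 0, 5, {0: {0: 2}}, {0: {5: 3}}, {0: 10},
                      alpha=0.1, gamma=0.01, T=12)
print(val)
```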

    A Fast Runtime Visualization of a GPU-Based 3D-FDTD Electromagnetic Simulation

    In this paper, we present the design and implementation of a fast runtime visualizer for a GPU-based 3D-FDTD electromagnetic simulation. We focus on improving the productivity of simulator development without compromising simulation performance. To preserve portability, we implemented the visualizer with the MVC model, in which the simulation kernels and the visualization process are completely separated. For high-speed visualization, an interoperability mechanism between OpenGL and CUDA was used in addition to efficient utilization of programmable shaders. We also propose an asynchronous multi-threaded execution with a triple-buffering technique so that developers can concentrate on developing their simulation kernels. Empirical visualization experiments on electromagnetic simulations for practical antenna design revealed that our implementation achieved a rendering throughput of 90 FPS for a viewport of 512 x 512 pixels, a 12.9-times speedup compared to when the OpenGL-CUDA interoperability mechanism was not utilized. When a standard visualization throughput of 60 FPS was selected, the performance overhead imposed by the visualization process was 15.8%, which is reasonably low compared to the speedup of the simulation kernel gained by GPU acceleration.
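    The triple-buffering idea can be sketched with plain threads (the paper's implementation uses CUDA/OpenGL interop and GPU buffers; this is only the slot-rotation logic): the simulator always has a free buffer to write into, and the renderer always picks up the latest completed frame, so neither side blocks the other.

```python
import threading

class TripleBuffer:
    """Three slots: one being written, one holding the newest complete
    frame, one being read. Producer and consumer only swap indices."""
    def __init__(self):
        self.buffers = [None, None, None]
        self.write_idx, self.ready_idx, self.read_idx = 0, 1, 2
        self.lock = threading.Lock()

    def publish(self, frame):            # called by the simulation thread
        self.buffers[self.write_idx] = frame
        with self.lock:                  # swap write and ready slots
            self.write_idx, self.ready_idx = self.ready_idx, self.write_idx

    def latest(self):                    # called by the render thread
        with self.lock:                  # swap ready and read slots
            self.read_idx, self.ready_idx = self.ready_idx, self.read_idx
        return self.buffers[self.read_idx]

tb = TripleBuffer()
tb.publish("frame-0")
tb.publish("frame-1")                    # overwrites the unread slot
print(tb.latest())  # -> frame-1
```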

    A soft-core processor for finite field arithmetic with a variable word size accelerator

    This paper presents the implementation and evaluation of an accelerator architecture for soft-core processors that speeds up the reduction process in arithmetic on GF(2^m) as used in Elliptic Curve Cryptography (ECC) systems. In this architecture, the word size of the accelerator can be customized when the architecture is configured on an FPGA. Focusing on the fact that the number of reduction operations on GF(2^m) depends on the irreducible polynomial and the word size, we propose employing an unconventional word size for the accelerator depending on the given irreducible polynomial, and we implement a MIPS-based soft-core processor coupled with a variable-word-size accelerator. Evaluation with several polynomials showed a performance improvement of up to 10.2 times compared to the 32-bit word size, even taking into account the maximum-frequency degradation of 20.4% caused by changing the word size. The advantage of using unconventional word sizes was also shown, suggesting the promise of this approach for low-power ECC systems.
    24th International Conference on Field Programmable Logic and Applications, FPL 2014; Technische Universitat Munchen, Munich; Germany; 1 September 2014 through 5 September 201
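    The reduction step the accelerator targets can be shown in software, with polynomials over GF(2) represented as Python integers (the hardware word-size tuning itself is not modeled here; the sketch only shows why the irreducible polynomial drives the cost of each step).

```python
def gf2_reduce(x, irreducible, m):
    """Reduce polynomial x modulo a degree-m irreducible polynomial.
    Each iteration cancels the current leading term by XORing in a
    shifted copy of the irreducible polynomial."""
    while x.bit_length() > m:
        shift = x.bit_length() - 1 - m
        x ^= irreducible << shift
    return x

# NIST B-163 binary field: x^163 + x^7 + x^6 + x^3 + 1 (example choice)
P163 = (1 << 163) | (1 << 7) | (1 << 6) | (1 << 3) | 1
print(hex(gf2_reduce(1 << 163, P163, 163)))  # -> 0xc9 (the low-weight tail)
```

Because the low-weight tail of the irreducible polynomial determines how many word-sized XORs each step needs, a word size matched to that polynomial can shorten the reduction, which is the effect the paper exploits.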

    Modeling topical trends over continuous time with priors

    In this paper, we propose a new method for topical trend analysis. We model topical trends by per-topic Beta distributions, as in Topics over Time (TOT), which was proposed as an extension of latent Dirichlet allocation (LDA). However, TOT is likely to overfit the timestamp data when extracting latent topics. Therefore, we apply prior distributions to the Beta distributions in TOT. Since the Beta distribution has no conjugate prior, we devise a trick: we set one of the two parameters of each per-topic Beta distribution to one based on a Bernoulli trial and apply a Gamma distribution as a conjugate prior. Consequently, we can marginalize out the parameters of the Beta distributions and thus treat timestamp data in a Bayesian fashion. In the evaluation experiment, we compare our method with LDA and TOT on a link detection task on the TDT4 dataset. We use word predictive probabilities as term weights and estimate document similarities by using those weights in a TFIDF-like scheme. The results show that our method achieves a moderate fitting to the timestamp data.
    Advances in Neural Networks - ISNN 2010: 7th International Symposium on Neural Networks, ISNN 2010, Shanghai, China, June 6-9, 2010, Proceedings, Part II. The original publication is available at www.springerlink.co
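    The conjugacy trick can be worked out directly (notation assumed): with one parameter fixed to one, the density Beta(x; a, 1) = a * x**(a - 1) is conjugate to a Gamma(alpha, beta) prior on a, since the likelihood of n observations is a**n * exp((a - 1) * sum(log x_i)). Observing x_1..x_n in (0, 1) therefore gives the posterior Gamma(alpha + n, beta - sum(log x_i)), and a can be marginalized out in closed form.

```python
import math

def gamma_posterior(alpha, beta, xs):
    """Conjugate update for a under Beta(x; a, 1) observations xs.
    Shape gains the number of observations; rate gains -sum(log x),
    which is positive because every x lies in (0, 1)."""
    return alpha + len(xs), beta - sum(math.log(x) for x in xs)

a_post, b_post = gamma_posterior(2.0, 1.0, [0.5, 0.25])
print(a_post, b_post)  # posterior mean of a is a_post / b_post
```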

    Dynamic hyperparameter optimization for Bayesian topical trend analysis

    This paper presents a new Bayesian topical trend analysis. We regard the parameters of the topic Dirichlet priors in latent Dirichlet allocation as a function of document timestamps and optimize these parameters by a gradient-based algorithm. Since our method gives similar hyperparameters to documents having similar timestamps, topic assignment in collapsed Gibbs sampling is affected by timestamp similarities. We compute TFIDF-based document similarities using the results of collapsed Gibbs sampling and evaluate our proposal on the link detection task of Topic Detection and Tracking.
    Proceeding of the 18th ACM conference: Hong Kong, China, 2009.11.02-2009.11.0
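    The indexing idea can be sketched minimally (the functional form below is assumed for illustration, not taken from the paper): making the per-topic Dirichlet hyperparameter a smooth function of the document timestamp automatically gives documents with similar timestamps similar priors.

```python
import math

def alpha_k(t, w0, w1):
    """Per-topic Dirichlet hyperparameter as a smooth function of
    timestamp t; the exponential keeps the hyperparameter positive.
    w0, w1 are the weights a gradient-based algorithm would fit."""
    return math.exp(w0 + w1 * t)

a1 = alpha_k(0.50, -1.0, 0.3)
a2 = alpha_k(0.51, -1.0, 0.3)
print(abs(a1 - a2))  # nearby timestamps -> nearly identical priors
```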

    Pipeline scheduling with input port constraints for an FPGA-based biochemical simulator

    This paper discusses a design methodology for high-throughput arithmetic pipeline modules in an FPGA-based biochemical simulator. Since the limitation of data-input bandwidth caused by port constraints often has a negative impact on pipeline scheduling results, we propose a priority assignment method for input data that enables efficient arithmetic pipeline scheduling under given input port constraints. Evaluation with rate-law functions frequently used in biochemical models revealed that the proposed method achieved shorter latency than ASAP and ALAP scheduling with random input orders, reducing hardware costs by 17.57% and 27.43% on average, respectively.
    The original publication is available at www.springerlink.co
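    The effect of the port constraint can be shown with a toy model (the paper's priority-assignment algorithm itself is not reproduced): if only a fixed number of operands can enter the pipeline per cycle, the order in which inputs are fed determines when each operation's operands are complete, and hence the schedule's latency.

```python
def input_ready_times(input_order, ports):
    """Cycle at which each input value becomes available when at most
    `ports` operands can be loaded per cycle."""
    return {name: i // ports for i, name in enumerate(input_order)}

# An operation a * b can start once both a and b have arrived;
# prioritizing its inputs lets it start a cycle earlier.
early = input_ready_times(['a', 'b', 'c', 'd'], ports=2)
late = input_ready_times(['c', 'd', 'a', 'b'], ports=2)
print(max(early['a'], early['b']), max(late['a'], late['b']))  # 0 1
```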